Introduction to Synthetic Control

Lee Kennedy-Shaffer, PhD

2024-06-18

Motivating Example: State Health Policies

Legislative Mandate for Tobacco Control — Proposition 99

CA Prop 99 in 1988:

Raised cigarette taxes

Funding for health ed

Logo for Ohio Vax-A-Million

OH May 2021:

Lottery for vaccinated people

Followed by other states

Question

What is the effect of these policies on cigarette sales and vaccine uptake, respectively?

Synthetic Control Method

Panel Data/Comparative Case Study

Generally requires:

  • One (or few) treated units

  • Many untreated units

  • Long pre-treatment history of outcomes for all units

  • Post-treatment outcomes for time period of interest

Idea

Find a weighted average of the control units that best approximates the pre-treatment history of the treated unit(s).

This is the synthetic control unit, which is then compared to the treated unit’s outcomes in the post-treatment period.

Example of Idea

Citation for Abadie et al. (2010)

Line plot showing cigarette sales per capita from 1970 to 2000 in California and the Rest of the US. California's line is falling more rapidly than that of the US, both before and after the break at 1988, creating a large and widening gap.

Abadie et al. (2010), Figure 1

Example of Idea

Citation for Abadie et al. (2010)

Line plot showing cigarette sales per capita from 1970 to 2000 in California and the Rest of the US. California's line is falling more rapidly than that of the US, both before and after the break at 1988, creating a large and widening gap.

Abadie et al. (2010), Figure 1

Line plot showing cigarette sales per capita from 1970 to 2000 in California and Synthetic California. The two lines nearly coincide prior to 1988 and diverge thereafter, with California's line falling more rapidly than that of synthetic California.

Abadie et al. (2010), Figure 2

Key Requirements

Title of Abadie (2021)

Title for Bonander et al. (2021)

Key Requirements

  • No anticipation
  • No spillover
  • Suitable control units: “stable weights”
  • Convex hull: non-extreme treated unit(s)
  • Effect size larger than routine fluctuations
  • Appropriate time horizon

Formal Specification

\[ \hat{\theta}_t = Y_{1t} - \sum_{i=1}^n w_i Y_{0it}, \]

where \(Y_{1t}\) is the outcome of the treated unit in period \(t\) and \(Y_{0it}\) is the outcome of the \(i\)th untreated unit in period \(t\).

The weights \(w_1, \ldots, w_n\) are nonnegative and sum to 1.

Minimizing Pre-Trend Difference

In the simplest form, the weights are chosen to minimize:

\[ \sum_{t=1}^{T-1} \left( Y_{1t} - \sum_{i=1}^n w_i Y_{0it} \right)^2, \]

where \(T-1\) is the last period for which the treated unit is pre-intervention.

Incorporating Covariates

Covariates can be incorporated in the weight minimization. For covariates (including pre-treatment outcomes) labelled \(k=1,\ldots,K\), choose a weight vector \(w\) that minimizes:

\[ \sum_{k=1}^K v_k \left( X_{1k} - \left(\sum_{i=1}^n X_{0ik} w_i \right) \right)^2, \]

where \(v_k\) are weights on the importance of each covariate, which can themselves be chosen to minimize the pre-intervention difference or by cross-validation on a split sample of pre-intervention times.

Weights

  • Do not use post-intervention outcomes

  • Generally sparse

  • Avoid extrapolation, permit interpolation

  • Transparent and (somewhat) interpretable

Table showing the weights for the synthetic California. The only non-zero weights are Colorado (0.164), Connecticut (0.069), Montana (0.199), Nevada (0.234), and Utah (0.334)

Abadie et al. (2010), Table 2

Estimand

Important

The estimand is again the average treatment effect on the treated (ATT): the effect of the policy on the treated unit compared to if it had not been treated.

The choice of time and scale for the comparison can be made by the investigator based on subject-matter knowledge.

Gap Plot

Abadie et al. (2010), Figure 3

Abadie et al. (2010), Figure 3

Robustness and Inference

Specification Tests

The pre-intervention mean squared prediction error (MSPE) of the SC fit can be used to assess fit.

To test robustness of results, can change:

  • Control units

  • Time frame considered

  • Covariates used

Assessing Fit

Abadie et al. (2010), Table 1

Abadie et al. (2010), Table 1

Placebo Test In-Space

Conduct the SC analysis with the same specifications for each control unit, excluding the treated unit. This gives a null distribution of estimates.

Can exclude those with much higher pre-intervention MSPEs.

Placebo Test In-Space

Plot of the gap between actual values and synthetic control values for California and 38 placebo test states. The California gap is near the extreme low end of the distribution for all post-intervention time points.

Abadie et al. (2010), Figure 4

Placebo Test In-Space

Abadie et al. (2010), Figure 6

Abadie et al. (2010), Figure 6

Placebo Test Estimators

Visual inspection of the observed result compared to the null distribution can follow. Or a specific estimator can be used to conduct a hypothesis test.

Common choices are:

  • First-period effect
  • Average effect over all included post-treatment periods
  • Post-treatment root mean square prediction error (RMSPE)
  • Post-treatment RMSPE/Pre-treatment RMSPE

Placebo Test In-Time

Can also run the analysis on dummy intervention time points.

Note

This is similar to the cross-validation approach sometimes used to select covariate weights.

If using both, interpret with caution.

Placebo Test In-Outcome/Population

In some cases, non-affected outcomes or populations may be available. These can be used as a null control or distribution.

Title for Bruhn et al. (2017)

Recommendations

Recommendations

  • Ensure ATT is appropriate

  • Consider trade-offs: more vs. fewer units, more vs. fewer time periods, interpolation vs. extrapolation, etc.

  • Pre-specify analyses: control units, covariates, years, placebo tests, MSPE restrictions, etc.

  • Run robustness checks wherever possible

  • Interpret results in context

Questions?